August 2022

Technical details

Who am I?

  • Guillaume [gijom] Roussellet
  • guillaume.roussellet@mcgill.ca
  • Office BRONF 555.
  • My research: Fixed Income, Credit Risk, Time Series Asset Pricing Models, Machine Learning, Financial Econometrics.
  • Preferred programming languages: R (mostly) and Python (a bit less).

What are we going to do?

  • Intense training to learn the basics of code and data management.
  • From here…

What are we going to do?

  • to there.

What are we going to do?

  • Lecture 1:
    • (a brief) History of computers.
    • Programming basics: classes.
    • Programming basics: loops and tests.
  • Lecture 2:
    • Data management.
    • Plotting.
  • Lecture 3:
    • Regression analysis.
    • Markdown for documents and presentations.

Why Programming?

How does a computer work?

  • Assume you are opening an Excel spreadsheet. You double-click.
    • Commands are executed by your computer in the background.
    • Document opens and is stored temporarily in the memory (RAM).
    • Excel is an interface allowing you to communicate with your computer and ask it to do computations for you.
    • Computations are performed by the CPU, which is essentially a clock.
    • Once the CPU has finished, it sends the result back to the RAM.
    • Saving writes the RAM into the Disk.
    • Quitting erases the RAM.

Why a coding class then?

  • Computers are machines that speak a bunch of languages.
  • They only do what you ask them to do.
  • Plain vanilla stuff is already implemented, but you’ll have to do more by yourself!

\(\Longrightarrow\) ENTERS PROGRAMMING!

  • We will learn the principles of communicating with your machine!
  • That will keep you from being held hostage by machines in the future.

Why Python?

  • Python is very versatile, and the community is strong.

  • It’s easy to find answers when you’re having trouble.

  • It’s free!

  • Alternatives (for finance):

    • R (free, a good alternative!)
    • Matlab (expensive, for maths and finance)
    • SAS (expensive, data management)
    • Stata and EViews (expensive and weird)
    • C++ (free, hard)

A foreword

  • Coding is hard!
    • Best way to learn is to do trial and error.
    • Practice on simple examples at home!
  • You should never say it doesn’t work.
    • Your role is to understand what’s going wrong.
    • A computer is stupid: if your syntax and grammar are wrong, there is no room for interpretation.
    • Google is definitely your best friend for error messages.
  • Useful resources: Python Doc, Stackoverflow, collection of helps

Before we go

Me and my python pet Java

Installing Python

Requirements

  • You need to install Python, and an interface to communicate with your computer in Python.

  • Programming usually involves:

    • Writing code in an editor to obtain a script.
    • Running the script to obtain the results.
  • Doing so is efficient because it allows you to see the complete structure of what you’re asking the computer to do.

  • There are many editors and interpreters, but we’ll use a simple setup.

    • Our platform will be Anaconda.
    • We will use the editor Spyder.

Anaconda

  • Anaconda can be downloaded from the anaconda website.
  • Installation is standard and explained here in case you need it.
  • Once this is done, run the Anaconda navigator:
Figure: Basic Anaconda Navigator

Launching Spyder

Figure: Basic Anaconda Navigator
Figure: Updating Spyder
Figure: Spyder Interface

The Spyder Interface

Figure: Spyder Interface

How does Python work?

  • Think about Python as a powerful calculator and much more from now on.
  • Examples:
    • Difficult computations like \(\sqrt{\exp(1852)}\), or option greeks.
    • Data exploration (mean, stdev, correlations, etc).
    • Data transformation (cleaning, creating new variables).
    • Text scraping.
    • Sending emails every hour to your teammates automatically.

How does Python work?

  • Python is open source and there is a huge community.
  • It includes so-called libraries that you can download.

Important coding principles:

  • The community is smarter than me.
  • If I have a problem, somebody had it before.
  • I should use readily available libraries as long as I understand what they do.

Important libraries:

  • Throughout the class we will use the same libraries over and over: numpy (matrix computation), matplotlib (charts), pandas (basic stats), SciPy (complex mathematics).
  • More info here.

Beginning with Python

First elements of syntax

  • Creating a simple variable with Python:
my_variable = 42
  • my_variable is a number (integer).
  • Once I send it to the console, it is in the memory and I can call it again.
new_var = my_variable - 5
print(new_var)
## 37
  • Variables are erased from memory when I quit Spyder, but the script lives on!

Simple types of variables

  • Many types of variables can be created.
  • float: numbers with decimals
my_float = 100.0
print(my_float)
## 100.0
  • string: a chain of characters
my_string = "I'll be back"
print(my_string)
## I'll be back
print(my_string[0:7])
## I'll be
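To check which type Python has assigned to a variable, the built-in type function can be used. The short sketch below reuses the variable names from the slides; the name last_word is introduced here for illustration only.

```python
# Variables as defined on the previous slides
my_variable = 42
my_float = 100.0
my_string = "I'll be back"

# type() reports the type Python assigned to each variable
print(type(my_variable))    # <class 'int'>
print(type(my_float))       # <class 'float'>
print(type(my_string))      # <class 'str'>

# Slicing: the start position is included, the end position is excluded
print(my_string[0:7])       # I'll be
last_word = my_string[-4:]  # hypothetical name: the last four characters
print(last_word)            # back
print(len(my_string))       # 12
```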

Simple types of variables (c’td)

  • list: can mix numbers and strings
my_list = [42, "I'll be back", 100.0]
print(my_list)
## [42, "I'll be back", 100.0]
  • We could also call the created variables:
my_list_bis = [my_variable, my_string, my_float]
print(my_list_bis)
## [42, "I'll be back", 100.0]
  • Checking if lists are the same:
my_list_bis == my_list
## True

Simple types of variables (c’td)

  • dictionaries: a list of named elements (rather than numbered)
my_dictionary = {}
my_dictionary['forty two'] = my_variable
my_dictionary['terminator'] = my_string
print(my_dictionary)
## {'forty two': 42, 'terminator': "I'll be back"}
  • Calling elements of a dictionary:
print(my_dictionary['terminator'])
## I'll be back
  • Comparing with lists:
# print(my_dictionary[0]) => Does not work: raises a KeyError, since 0 is not a key
print(my_list[0])
## 42
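Because a dictionary is indexed by its keys rather than by position, it can help to check that a key exists before calling it. A minimal sketch, assuming the same dictionary content as above; the name missing_value is hypothetical.

```python
my_dictionary = {"forty two": 42, "terminator": "I'll be back"}

# Keys, not positions, index a dictionary
print(list(my_dictionary.keys()))     # ['forty two', 'terminator']

# Test for a key before using it
print("terminator" in my_dictionary)  # True
print(0 in my_dictionary)             # False: 0 is not a key

# .get returns a fallback value instead of raising a KeyError
missing_value = my_dictionary.get(0, "no such key")  # hypothetical name
print(missing_value)                  # no such key
```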

What have we learnt so far?

  • Variables are easy to create, and they are in memory once created.
  • Variables can have different types depending on their use.
  • We can call elements of variables with square brackets.

Caution!

  • First element is always numbered 0.
print(my_list)
## [42, "I'll be back", 100.0]
print(my_list[1])
## I'll be back
  • Operations can be called on the fly on already created variables.
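As an illustration of operations called on the fly, the sketch below works directly on elements of my_list. The name first_plus_last is an assumption made for this example.

```python
my_list = [42, "I'll be back", 100.0]

first_plus_last = my_list[0] + my_list[2]  # hypothetical name: 42 + 100.0
print(first_plus_last)       # 142.0
print(my_list[-1])           # 100.0, negative indices count from the end
print(len(my_list))          # 3 elements, numbered 0, 1 and 2
print(my_list[1].upper())    # I'LL BE BACK
```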

More variable types?

  • In our finance curriculum, we will have to play with matrices and vectors.
  • This is allowed in Python with the numpy library.
  • We need to tell our session that we want to use all the stuff located in numpy:
import numpy as np
  • We can construct vectors (succession of numbers):
my_vector = np.array([1.5,2.3])
print(my_vector)
## [1.5 2.3]
  • and matrices (tables of numbers):
my_matrix = np.array([[1,2],[3,4]])
print(my_matrix)
## [[1 2]
##  [3 4]]

Matrix manipulation

  • Knowing the dimension of a matrix?
print(my_matrix.shape)
## (2, 2)
  • Matrix multiplication (please watch this video if you don’t remember)
my_multiplication = np.matmul(my_matrix, my_vector)
print(my_multiplication)
## [ 6.1 13.7]
  • Matrix transpose (can use .T instead):
print(my_matrix.transpose())
## [[1 3]
##  [2 4]]
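The matrix product can also be written with the @ operator, which should not be confused with the element-by-element product *. A small sketch using the same matrix and vector; the names product_matmul, product_at and elementwise are introduced for illustration.

```python
import numpy as np

my_matrix = np.array([[1, 2], [3, 4]])
my_vector = np.array([1.5, 2.3])

# Two equivalent ways of writing the matrix product
product_matmul = np.matmul(my_matrix, my_vector)
product_at = my_matrix @ my_vector
print(np.allclose(product_matmul, product_at))   # True

# The * operator multiplies element by element instead:
# each row of my_matrix is multiplied entry-wise by my_vector
elementwise = my_matrix * my_vector
print(elementwise.shape)                         # (2, 2)

# .T is a shortcut for .transpose()
print(np.array_equal(my_matrix.T, my_matrix.transpose()))  # True
```

Summing each row of elementwise gives back the matrix product, which is one way to see how the two operations relate.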

Create empty matrices

  • Creating a matrix of zeros
mat_zeros = np.zeros((2,10))
print([mat_zeros, mat_zeros.shape])
## [array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
##        [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]]), (2, 10)]
  • matrix of ones:
mat_ones = np.ones((1,5))
print([mat_ones, mat_ones.shape])
## [array([[1., 1., 1., 1., 1.]]), (1, 5)]
  • Identity matrix:
mat_diag = np.eye(2) ; print(mat_diag)
## [[1. 0.]
##  [0. 1.]]

Matrix simple operations

mat_diag + 1; mat_diag - 1; mat_diag * 2; mat_diag/2
## array([[2., 1.],
##        [1., 2.]])
## array([[ 0., -1.],
##        [-1.,  0.]])
## array([[2., 0.],
##        [0., 2.]])
## array([[0.5, 0. ],
##        [0. , 0.5]])
mat1 = np.ones((2,2)); mat2 = np.array([[1,2],[3,4]])
mat1 + mat2; mat1 - mat2; mat1 * mat2; mat1/mat2
## array([[2., 3.],
##        [4., 5.]])
## array([[ 0., -1.],
##        [-2., -3.]])
## array([[1., 2.],
##        [3., 4.]])
## array([[1.        , 0.5       ],
##        [0.33333333, 0.25      ]])

What have we learnt so far?

  • We can construct vectors and matrices with numpy.
  • We can perform the standard computations calling array.function.
  • Some specific objects can be constructed directly:
    • zeros with np.zeros((nrow,ncol)),
    • ones with np.ones((nrow,ncol)),
    • identity with np.eye(nrow),
    • diagonal with np.diag([number1, ..., numbern]).
  • Operations are performed element by element.
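A minimal sketch of these constructors, assuming numpy is imported as np. Note that the shape is passed as a single tuple and the diagonal entries as a single list; the sizes and values below are arbitrary.

```python
import numpy as np

nrow, ncol = 2, 3                    # arbitrary dimensions for the example

mat_zeros = np.zeros((nrow, ncol))   # shape passed as one tuple
mat_ones = np.ones((nrow, ncol))
mat_identity = np.eye(nrow)          # (2x2) identity matrix
mat_diagonal = np.diag([1, 2, 3])    # diagonal entries passed as one list

print(mat_zeros.shape)               # (2, 3)
print(mat_identity.shape)            # (2, 2)
print(mat_diagonal)
# [[1 0 0]
#  [0 2 0]
#  [0 0 3]]
```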

Caution!

Remain careful about dimensions!

mat_diag + mat_ones
## ValueError: operands could not be broadcast together with shapes (2,2) (1,5)
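Not every mismatch in shapes fails: numpy stretches a dimension of size 1 to match the other operand, a rule called broadcasting. The sketch below catches the error above and contrasts it with a case that works; row_vector is a hypothetical example added here.

```python
import numpy as np

mat_diag = np.eye(2)           # shape (2, 2)
mat_ones = np.ones((1, 5))     # shape (1, 5)
row_vector = np.ones((1, 2))   # hypothetical example, shape (1, 2)

# (2, 2) + (1, 5): 2 columns against 5 columns cannot be matched
try:
    mat_diag + mat_ones
except ValueError as error:
    print("Incompatible shapes:", error)

# (2, 2) + (1, 2) works: the single row is added to every row
broadcast_sum = mat_diag + row_vector
print(broadcast_sum)
# [[2. 1.]
#  [1. 2.]]
```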

Some practical examples:

  • Evaluate and understand what the following expressions are producing. In case computation cannot be performed, explain why.
2 ** 3 ; pow(2,3)
2 ** 3/4
np.ones([2,2]) @ np.array([[1,2],[3,4]])
[1,'1'] * [3, '4']

a = 4
b = 5
a -= 3 + b

['a', 'b', 'c', 'd'][2]
'Stonks only go up'[5:9]
['Stonks only go up'][5:9]

np.ones([1,5]) + np.diag([1,2])
  • Create a \((2\times 2)\) matrix containing the elements (1,2,3,4), filled column by column. Invert it with the numpy function np.linalg.inv.

Best practices

Commenting the code

1. ALWAYS COMMENT YOUR CODE

  • Commenting allows your future self to understand and reuse the code.
  • Commenting allows others to understand what you are doing.
  • You’ll rapidly reach more than 1,000 lines of script.
  • It’s more costly when you write it but it is ESSENTIAL.

How do we comment and what does commented code look like?

  • We can comment entire paragraphs with """ comment """.
  • We can comment end of lines with #.

A typical script:

""" 0. Import libraries """
import numpy as np

""" 1. Building the needed matrices. """
my_matrix = np.array([[1,3],[2,4]]) # This matrix is (2x2)
my_matrix2 = np.ones((2,5)) # This matrix is (2x5)

""" 2. Performing our operations """
result = np.matmul(my_matrix, my_matrix2)

# Once our result is computed, we can perform other operations

Naming the variables

2. CHOOSE EXPLICIT NAMES FOR VARIABLES

  • You won’t remember what X9 is.
  • So choose variable names that make sense out of context.
  • You can have them as long as you want.

Example:

matrix_example_for_Python_course = np.array([[1, 2], [3, 4]])

BE CAREFUL

  • Python is case-sensitive
  • Variables toto, Toto and toTo are all different!

Basic useful commands

Programming 101.

  • There are two basic types of operations that make programming appealing.
    • Tests
    • Loops
  • Tests allow the programmer to verify whether a particular condition is met for any object, and apply an operation depending on the case.
    • For instance, you retrieve many company balance sheets.
    • You want to know which companies have leverage greater than a threshold.
  • Loops are useful when your data is large and you have to perform similar operations row by row or column by column.
    • Imagine you have 10,000 companies.
    • You want to perform the leverage test for each company.
  • Tests are called with if and else.
  • Loops are called with for or while.

If statements

  • Let us consider my_variable which was equal to…?
  • Let’s test where it is:
my_bool = my_variable > 50 ; print(my_bool)
## False
my_bool = my_variable <= 50 ; print(my_bool)
## True
my_bool = my_variable == 50 ; print(my_bool)
## False
  • my_bool is yet another type of object called boolean, which says True or False.
  • if statements will create booleans to make a decision.

If statements

  • Let us make the computer print different messages depending on whether my_variable is above 9,000.
if my_variable > 9000:
  print("It's over 9,000!")
elif my_variable == 9000:
  print("It's exactly 9000")
else:
  print("I have to stop watching Dragon Ball.")
## I have to stop watching Dragon Ball.
  • For a simple test, put if [CONDITION]: and go to the next line.
  • elif is the combination of else and if to test a second condition.
  • else: gathers all remaining cases.
  • The space at the start of the lines after if, elif and else is called indentation. It is EXTREMELY IMPORTANT.
  • The code won’t run without it.

If statements

  • You can test many types of conditions:
    • Equal: == (not to be confused with =, which assigns a value)
    • Not equal: !=
    • Less than: <
    • Less or equal: <=
    • Greater than: >
    • Greater or equal: >=
  • You can test joint conditions:
    • all conditions are met: if a < 10 and a >= 5:
    • at least one condition is met: if a < 10 or a >= 5:
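The joint conditions above can be tried directly by storing the resulting booleans. A short sketch, assuming a = 7 for illustration; the variable names are hypothetical.

```python
a = 7  # assumed value for the example

both_hold = a < 10 and a >= 5     # True only if both conditions hold
either_holds = a < 10 or a >= 5   # True if at least one holds (here: for any number)
chained = 5 <= a < 10             # Python shorthand for the 'and' test above
negated = not (a < 10)            # 'not' flips a boolean

print(both_hold, either_holds, chained, negated)
# True True True False
```

With a = 12 instead, both_hold and chained become False while either_holds stays True.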

If statements

  • You can nest conditions:
a = 42
if a > 10:
  print("above ten")
  if a > 20:
    print("and also above twenty!")
  else: 
    print("but not above twenty.")
## above ten
## and also above twenty!
  • the else only concerns the second test. If I defined a = 5, what would happen?
  • Every time you add a new if, you need another level of indentation (the code won’t run otherwise).
  • This architecture will allow you to read scripts easily.

for loops

  • for loops allow you to perform the same operation for each element of an object.
  • It takes a starting and an ending value.
""" Initializing our result vector """
dummy_greater = np.zeros((100,1)) # Dimension is (100x1)

""" Performing for loop """
for i in range(0,100):
  if a >= (i+1):
    dummy_greater[i] = 1
  • What does this code do?

  • range(start,end) consists of all integers from start to end-1.

  • Don’t forget: first element of a vector is numbered 0.
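With numpy, the same result can often be obtained without a loop by comparing against a whole vector at once. The sketch below reproduces the loop from the slide, assuming a = 42 as defined earlier, and compares it with a vectorized version; the names thresholds and dummy_vectorized are introduced for illustration.

```python
import numpy as np

a = 42  # value assumed from the earlier slide

# Loop version from the slide
dummy_greater = np.zeros((100, 1))
for i in range(0, 100):
    if a >= (i + 1):
        dummy_greater[i] = 1

# Vectorized version: compare a with 1, 2, ..., 100 in one step
thresholds = np.arange(1, 101).reshape(100, 1)      # hypothetical name
dummy_vectorized = (a >= thresholds).astype(float)  # True/False turned into 1.0/0.0

print(np.array_equal(dummy_greater, dummy_vectorized))  # True
print(int(dummy_greater.sum()))                         # 42
```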

Customizing for loops

  • You can loop on pretty much whatever you want, even strings
Score_teams = {}
Score_teams["MTL"] = 4
Score_teams["VGK"] = 1

for names in Score_teams.keys():
  print(Score_teams[names])
## 4
## 1
  • This is important since the structure of your data will matter.
  • You can loop on variables in your data for instance.
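When both the key and the value are needed, looping on .items() avoids calling the dictionary inside the loop. A brief sketch with the same scores; the name winner is hypothetical.

```python
Score_teams = {"MTL": 4, "VGK": 1}

# .items() yields the key and the value together
for name, score in Score_teams.items():
    print(name, score)
# MTL 4
# VGK 1

# Looping over a string visits one character at a time
for letter in "MTL":
    print(letter)

# The key with the highest value
winner = max(Score_teams, key=Score_teams.get)  # hypothetical name
print(winner)   # MTL
```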

Exercise with for loops

  • Assume you have a bond paying off semi-annually.
  • Its cashflow is two dollars every payment period.
  • Its last cashflow is $102.
  • Its maturity is five years.
  • Assuming a year has 364 days, construct the vector of daily cashflows by completing the following code:
# Constructing a vector of cashflows
Final_cashflow_vector = np.zeros((364 * 5))

# Complete the rest 

while loops

  • while loops are a combination of for and if.
  • They perform an operation while a certain condition is true.
  • It is very useful for controlling your P&L for instance: keep the position while gains are below a threshold, and liquidate it otherwise.
  • Syntax is similar to for and if:
import random as rd

Gains = np.zeros(1)
count = 0
threshold = 10 
while Gains[count] < threshold:
  heads_tails = rd.randint(0,1)

  if heads_tails == 1:
    Gains = np.append(Gains, Gains[count] + 1)
  else:
    Gains = np.append(Gains, Gains[count] - 1)
  count +=1 # same as count = count + 1
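A while loop only stops when its condition becomes false, and a coin-flip game like this one can take a very long time to reach the threshold. The variant below is a sketch, not the script from the slide: it adds an assumed safety cap max_steps and an assumed seed for reproducibility, and stores gains in a plain list.

```python
import random as rd

rd.seed(2022)          # assumed seed, only so that runs are reproducible
threshold = 10
max_steps = 10_000     # assumed safety cap, not part of the original script

gains = [0]            # a Python list grows cheaply with append
while gains[-1] < threshold and len(gains) <= max_steps:
    # +1 on heads, -1 on tails
    step = 1 if rd.randint(0, 1) == 1 else -1
    gains.append(gains[-1] + step)

print(len(gains) - 1, "coin flips, final gain:", gains[-1])
```

The loop ends either because the threshold was reached or because the cap was hit, so the final gain should be checked before using the result.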

while loops

import matplotlib.pyplot as plt

plt.plot(Gains)

Getting help

  • There are a lot of things you don’t know how to do in code.
  • Resources: Google and help
help(plt.plot)
## Help on function plot in module matplotlib.pyplot:
## 
## plot(*args, scalex=True, scaley=True, data=None, **kwargs)
##     Plot y versus x as lines and/or markers.
##     
##     Call signatures::
##     
##         plot([x], y, [fmt], *, data=None, **kwargs)
##         plot([x], y, [fmt], [x2], y2, [fmt2], ..., **kwargs)
##     
##     The coordinates of the points or line nodes are given by *x*, *y*.
##     
##     The optional parameter *fmt* is a convenient way for defining basic
##     formatting like color, marker and linestyle. It's a shortcut string
##     notation described in the *Notes* section below.
##     
##     >>> plot(x, y)        # plot x and y using default line style and color
##     >>> plot(x, y, 'bo')  # plot x and y using blue circle markers
##     >>> plot(y)           # plot y using x as index array 0..N-1
##     >>> plot(y, 'r+')     # ditto, but with red plusses
##     
##     You can use `.Line2D` properties as keyword arguments for more
##     control on the appearance. Line properties and *fmt* can be mixed.
##     The following two calls yield identical results:
##     
##     >>> plot(x, y, 'go--', linewidth=2, markersize=12)
##     >>> plot(x, y, color='green', marker='o', linestyle='dashed',
##     ...      linewidth=2, markersize=12)
##     
##     When conflicting with *fmt*, keyword arguments take precedence.
##     
##     
##     **Plotting labelled data**
##     
##     There's a convenient way for plotting objects with labelled data (i.e.
##     data that can be accessed by index ``obj['y']``). Instead of giving
##     the data in *x* and *y*, you can provide the object in the *data*
##     parameter and just give the labels for *x* and *y*::
##     
##     >>> plot('xlabel', 'ylabel', data=obj)
##     
##     All indexable objects are supported. This could e.g. be a `dict`, a
##     `pandas.DataFame` or a structured numpy array.
##     
##     
##     **Plotting multiple sets of data**
##     
##     There are various ways to plot multiple sets of data.
##     
##     - The most straight forward way is just to call `plot` multiple times.
##       Example:
##     
##       >>> plot(x1, y1, 'bo')
##       >>> plot(x2, y2, 'go')
##     
##     - Alternatively, if your data is already a 2d array, you can pass it
##       directly to *x*, *y*. A separate data set will be drawn for every
##       column.
##     
##       Example: an array ``a`` where the first column represents the *x*
##       values and the other columns are the *y* columns::
##     
##       >>> plot(a[0], a[1:])
##     
##     - The third way is to specify multiple sets of *[x]*, *y*, *[fmt]*
##       groups::
##     
##       >>> plot(x1, y1, 'g^', x2, y2, 'g-')
##     
##       In this case, any additional keyword argument applies to all
##       datasets. Also this syntax cannot be combined with the *data*
##       parameter.
##     
##     By default, each line is assigned a different style specified by a
##     'style cycle'. The *fmt* and line property parameters are only
##     necessary if you want explicit deviations from these defaults.
##     Alternatively, you can also change the style cycle using
##     :rc:`axes.prop_cycle`.
##     
##     
##     Parameters
##     ----------
##     x, y : array-like or scalar
##         The horizontal / vertical coordinates of the data points.
##         *x* values are optional and default to `range(len(y))`.
##     
##         Commonly, these parameters are 1D arrays.
##     
##         They can also be scalars, or two-dimensional (in that case, the
##         columns represent separate data sets).
##     
##         These arguments cannot be passed as keywords.
##     
##     fmt : str, optional
##         A format string, e.g. 'ro' for red circles. See the *Notes*
##         section for a full description of the format strings.
##     
##         Format strings are just an abbreviation for quickly setting
##         basic line properties. All of these and more can also be
##         controlled by keyword arguments.
##     
##         This argument cannot be passed as keyword.
##     
##     data : indexable object, optional
##         An object with labelled data. If given, provide the label names to
##         plot in *x* and *y*.
##     
##         .. note::
##             Technically there's a slight ambiguity in calls where the
##             second label is a valid *fmt*. `plot('n', 'o', data=obj)`
##             could be `plt(x, y)` or `plt(y, fmt)`. In such cases,
##             the former interpretation is chosen, but a warning is issued.
##             You may suppress the warning by adding an empty format string
##             `plot('n', 'o', '', data=obj)`.
##     
##     Other Parameters
##     ----------------
##     scalex, scaley : bool, optional, default: True
##         These parameters determined if the view limits are adapted to
##         the data limits. The values are passed on to `autoscale_view`.
##     
##     **kwargs : `.Line2D` properties, optional
##         *kwargs* are used to specify properties like a line label (for
##         auto legends), linewidth, antialiasing, marker face color.
##         Example::
##     
##         >>> plot([1, 2, 3], [1, 2, 3], 'go-', label='line 1', linewidth=2)
##         >>> plot([1, 2, 3], [1, 4, 9], 'rs', label='line 2')
##     
##         If you make multiple lines with one plot command, the kwargs
##         apply to all those lines.
##     
##         Here is a list of available `.Line2D` properties:
##     
##         Properties:
##         agg_filter: a filter function, which takes a (m, n, 3) float array and a dpi value, and returns a (m, n, 3) array
##         alpha: float or None
##         animated: bool
##         antialiased or aa: bool
##         clip_box: `.Bbox`
##         clip_on: bool
##         clip_path: Patch or (Path, Transform) or None
##         color or c: color
##         contains: callable
##         dash_capstyle: {'butt', 'round', 'projecting'}
##         dash_joinstyle: {'miter', 'round', 'bevel'}
##         dashes: sequence of floats (on/off ink in points) or (None, None)
##         data: (2, N) array or two 1D arrays
##         drawstyle or ds: {'default', 'steps', 'steps-pre', 'steps-mid', 'steps-post'}, default: 'default'
##         figure: `.Figure`
##         fillstyle: {'full', 'left', 'right', 'bottom', 'top', 'none'}
##         gid: str
##         in_layout: bool
##         label: object
##         linestyle or ls: {'-', '--', '-.', ':', '', (offset, on-off-seq), ...}
##         linewidth or lw: float
##         marker: marker style
##         markeredgecolor or mec: color
##         markeredgewidth or mew: float
##         markerfacecolor or mfc: color
##         markerfacecoloralt or mfcalt: color
##         markersize or ms: float
##         markevery: None or int or (int, int) or slice or List[int] or float or (float, float)
##         path_effects: `.AbstractPathEffect`
##         picker: float or callable[[Artist, Event], Tuple[bool, dict]]
##         pickradius: float
##         rasterized: bool or None
##         sketch_params: (scale: float, length: float, randomness: float)
##         snap: bool or None
##         solid_capstyle: {'butt', 'round', 'projecting'}
##         solid_joinstyle: {'miter', 'round', 'bevel'}
##         transform: `matplotlib.transforms.Transform`
##         url: str
##         visible: bool
##         xdata: 1D array
##         ydata: 1D array
##         zorder: float
##     
##     Returns
##     -------
##     lines
##         A list of `.Line2D` objects representing the plotted data.
##     
##     See Also
##     --------
##     scatter : XY scatter plot with markers of varying size and/or color (
##         sometimes also called bubble chart).
##     
##     Notes
##     -----
##     **Format Strings**
##     
##     A format string consists of a part for color, marker and line::
##     
##         fmt = '[marker][line][color]'
##     
##     Each of them is optional. If not provided, the value from the style
##     cycle is used. Exception: If ``line`` is given, but no ``marker``,
##     the data will be a line without markers.
##     
##     Other combinations such as ``[color][marker][line]`` are also
##     supported, but note that their parsing may be ambiguous.
##     
##     **Markers**
##     
##     =============    ===============================
##     character        description
##     =============    ===============================
##     ``'.'``          point marker
##     ``','``          pixel marker
##     ``'o'``          circle marker
##     ``'v'``          triangle_down marker
##     ``'^'``          triangle_up marker
##     ``'<'``          triangle_left marker
##     ``'>'``          triangle_right marker
##     ``'1'``          tri_down marker
##     ``'2'``          tri_up marker
##     ``'3'``          tri_left marker
##     ``'4'``          tri_right marker
##     ``'s'``          square marker
##     ``'p'``          pentagon marker
##     ``'*'``          star marker
##     ``'h'``          hexagon1 marker
##     ``'H'``          hexagon2 marker
##     ``'+'``          plus marker
##     ``'x'``          x marker
##     ``'D'``          diamond marker
##     ``'d'``          thin_diamond marker
##     ``'|'``          vline marker
##     ``'_'``          hline marker
##     =============    ===============================
##     
##     **Line Styles**
##     
##     =============    ===============================
##     character        description
##     =============    ===============================
##     ``'-'``          solid line style
##     ``'--'``         dashed line style
##     ``'-.'``         dash-dot line style
##     ``':'``          dotted line style
##     =============    ===============================
##     
##     Example format strings::
##     
##         'b'    # blue markers with default shape
##         'or'   # red circles
##         '-g'   # green solid line
##         '--'   # dashed line with default color
##         '^k:'  # black triangle_up markers connected by a dotted line
##     
##     **Colors**
##     
##     The supported color abbreviations are the single letter codes
##     
##     =============    ===============================
##     character        color
##     =============    ===============================
##     ``'b'``          blue
##     ``'g'``          green
##     ``'r'``          red
##     ``'c'``          cyan
##     ``'m'``          magenta
##     ``'y'``          yellow
##     ``'k'``          black
##     ``'w'``          white
##     =============    ===============================
##     
##     and the ``'CN'`` colors that index into the default property cycle.
##     
##     If the color is the only part of the format string, you can
##     additionally use any  `matplotlib.colors` spec, e.g. full names
##     (``'green'``) or hex strings (``'#008000'``).

Programming like a boss

Going further: functions

  • In your programming career, you will likely feel limited by standard python functions.
  • Think about Excel: it computes sums and averages nicely, but many things are not implemented:
    • Calculating distributions (histograms)
    • performing many linear regressions
    • implementing Black-Scholes on many options.
  • You will need to tune your code to your particular aim.
    • Case 1: Somebody already coded that for you (library blackscholesanalytics for instance)
    • Case 2: The thing you want is not available.
  • Python allows you to construct/modify functions that can perform any set of operations you want.

functions: an example with options data

  • I’m loading options data on a day (we’ll cover that next lecture).
print(Option_data.columns)
## Index(['quote_date', 'underlying_symbol', 'root', 'expiry', 'strike', 'type',
##        'open_interest', 'total_volume', 'high', 'low', 'open', 'last',
##        'last_bid_price', 'last_ask_price', 'underlying_close', 'series_type',
##        'product_type'],
##       dtype='object')
print(Option_data.shape)
## (228, 17)

Reducing the data

print(Option_data.head(5))
##    quote_date underlying_symbol  ... series_type product_type
## 0  1990-01-02              ^SPX  ...         NaN          NaN
## 1  1990-01-02              ^SPX  ...         NaN          NaN
## 2  1990-01-02              ^SPX  ...         NaN          NaN
## 3  1990-01-02              ^SPX  ...         NaN          NaN
## 4  1990-01-02              ^SPX  ...         NaN          NaN
## 
## [5 rows x 17 columns]
reduced_Option_data = Option_data[["quote_date", "strike", 
  "expiry","type", "last_ask_price", "underlying_close"]]
print(reduced_Option_data)
##      quote_date  strike      expiry type  last_ask_price  underlying_close
## 0    1990-01-02   275.0  1990-03-17    C           86.88            359.69
## 1    1990-01-02   275.0  1990-03-17    P            0.94            359.69
## 2    1990-01-02   300.0  1990-03-17    C           62.75            359.69
## 3    1990-01-02   300.0  1990-03-17    P            1.38            359.69
## 4    1990-01-02   325.0  1990-03-17    C           39.75            359.69
## ..          ...     ...         ...  ...             ...               ...
## 223  1990-01-02   350.0  1990-12-22    P           16.00            359.69
## 224  1990-01-02   375.0  1990-12-22    C           19.13            359.69
## 225  1990-01-02   375.0  1990-12-22    P           26.13            359.69
## 226  1990-01-02   400.0  1990-12-22    C           10.13            359.69
## 227  1990-01-02   400.0  1990-12-22    P           40.00            359.69
## 
## [228 rows x 6 columns]

Function syntax

  • Let us create a function that computes moneyness
  • Moneyness is an indicator of whether options insure against probable events.
  • It is roughly equal to strike/underlying.
  • You need to do it for every option!
def Compute_moneyness (strike, underlying):
  return strike/underlying

Compute_moneyness(reduced_Option_data["strike"][0],
                  reduced_Option_data["underlying_close"][0])
## 0.7645472490199895
  • We could now apply our Compute_moneyness function to any option.
  • We could also do a for loop to compute the moneyness of all options.
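The for loop mentioned above could look like the following sketch. To stay self-contained it does not use the options table itself: it assumes a short list with the first three distinct strikes and the underlying close shown in the printed data.

```python
def Compute_moneyness(strike, underlying):
    return strike / underlying

# Assumed inputs, copied by hand from the first rows of the table
strikes = [275.0, 300.0, 325.0]
underlying_close = 359.69

# Apply the function to each strike in turn
moneyness_values = []
for strike in strikes:
    moneyness_values.append(Compute_moneyness(strike, underlying_close))

print([round(value, 4) for value in moneyness_values])
```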

Tuning the function

  • Options are either in-the-money, at-the-money, or out-of-the-money.
    • For puts: ATM (1), ITM (>1), OTM (<1)
    • For calls: ATM (1), ITM (<1), OTM (>1)
def Compute_moneyness (strike, underlying, opt_type):
  Moneyness = strike/underlying
  if Moneyness == 1:
    result = "ATM"
  elif Moneyness < 1:
    if opt_type == "P":
      result = "OTM"
    else:
      result = "ITM"
  else:
    if opt_type == "P":
      result = "ITM"
    else: 
      result = "OTM"
  return result

Compute_moneyness(reduced_Option_data["strike"][0],
                  reduced_Option_data["underlying_close"][0], reduced_Option_data["type"][0])
## 'ITM'
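With decimal numbers, the test Moneyness == 1 is rarely exactly true, so an option is almost never labelled ATM. One possible refinement, sketched below, treats moneyness close to 1 as ATM. The function name Classify_moneyness and the tolerance of 0.01 are assumptions made for this example, not part of the course code.

```python
import math

def Classify_moneyness(strike, underlying, opt_type, tolerance=0.01):
    """Hypothetical variant: moneyness within `tolerance` of 1 counts as ATM."""
    moneyness = strike / underlying
    if math.isclose(moneyness, 1.0, abs_tol=tolerance):
        return "ATM"
    if opt_type == "P":
        # Puts are in the money when the strike is above the underlying
        return "ITM" if moneyness > 1 else "OTM"
    # Calls are in the money when the strike is below the underlying
    return "ITM" if moneyness < 1 else "OTM"

print(Classify_moneyness(275.0, 359.69, "C"))   # ITM
print(Classify_moneyness(275.0, 359.69, "P"))   # OTM
print(Classify_moneyness(360.0, 359.69, "C"))   # ATM
```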

Functions

  • Functions begin with def
  • Then you define the name of the function
  • You define the arguments of the function in (argument), separated by commas.
  • You send back the result with return.
  • You can send back any type of object.
  • The function is stored and you can call it whenever you want.
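Since a function can send back any type of object, several results can be returned together, for instance in a dictionary. A small sketch; the function Describe_option and its keys are hypothetical names chosen for illustration.

```python
def Describe_option(strike, underlying):
    """Hypothetical helper returning several results at once in a dictionary."""
    moneyness = strike / underlying
    return {"moneyness": moneyness, "strike_above_spot": strike > underlying}

# The function is stored once and can be called as often as needed
summary = Describe_option(400.0, 359.69)
print(summary["strike_above_spot"])   # True
print(type(summary))                  # <class 'dict'>
```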

Exercises

Compute the price of a bond

  • Discount rate is 5%.
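  • Bond pays an annual cashflow of $2.
  • Bond has maturity 10 years.

\[ P = \sum_{i=1}^T \frac{cp\times FV}{(1+r)^i} + \frac{FV}{(1+r)^T} \]

Here \(cp\) is the coupon rate, \(FV\) the face value, \(r\) the discount rate, and \(T\) the maturity in years.

Exercise: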
  1. Represent cashflows and discount factors as vectors and use np.matmul to compute the price of the bond.
  2. Construct a function that can price any bond, with arguments coupon, FV, maturity and discount_rate.
  3. Considering that each discount rate can be different, construct a function that can accept a vector of discount rates.
  4. Apply your functions on the original bond, on the same bond with coupon rates 1% to 10% by increments of 1pp.
  5. Do the same if the discount rates are 1% for the one-year rate and increase by 25bps for each year of maturity.

Appendix: Same things with R

Basic objects

x_int <- 12L # Integer (without the L suffix, R stores 12 as a double)
x_flo <- 1.5 # float
x_vec <- c(1,5,12) # vector of 3 elements
x_seq <- seq(0, 10, by = 0.5) # vector of all floats from 0 to 10 by step of 0.5
x_mat <- matrix(1:6, nrow = 2, ncol = 3) # matrix (2x3) filled by COLUMNS first

# Lists
x_lst <- list("a" = c(1:3), "b" = "toto"); print(x_lst)
## $a
## [1] 1 2 3
## 
## $b
## [1] "toto"
# Data frames
x_dfm <- data.frame("first_var" = c(5:8), "second_var" = c("a","b","c","d")); print(x_dfm)
##   first_var second_var
## 1         5          a
## 2         6          b
## 3         7          c
## 4         8          d

Basic objects

  • Careful! First element is numbered 1, not 0!
# matrices
#---------
# Zeros:
x_zeros <- rep(0, 10); x_mat_zeros <- matrix(0, nrow = 2, ncol = 3)
# Ones: same principle replacing 0 by 1 in previous expressions

# Diagonal matrix
x_diag <- diag(1,10) # (10x10) identity matrix
# Multidimensional arrays
x_array <- array(0, dim = c(2,3,5)) # (2x3x5) array of numbers

# Transpose matrix
x_transpose <- t(x_mat) # (3x2) matrix
# Inverse matrix
x_inv <- solve(x_diag)
# Matrix multiplication
x_transpose %*% x_mat
##      [,1] [,2] [,3]
## [1,]    5   11   17
## [2,]   11   25   39
## [3,]   17   39   61

Tests and loops

# If tests
x <- 42
if (x > 10) {
  print("x > 10")
}
## [1] "x > 10"
# Ifs can be called on the fly
my_vec <- c(-5:5)
print(my_vec[my_vec < 0])
## [1] -5 -4 -3 -2 -1
my_vec[my_vec < 0] <- NA
print(my_vec)
##  [1] NA NA NA NA NA  0  1  2  3  4  5

Tests and loops

# For loops
y <- 0 
for (i in 1:10){
  y = y + 1
}
print(y)
## [1] 10
# While loops
y <- 0 
while (y < 10){
  y = y + 1
}
print(y)
## [1] 10

Replacing for loops: apply

  • If you are dealing with a table/array, you can apply a function onto one or several dimensions.
  • Assume you have an array organized in (indiv x charac x time) and you want to compute some averages.
my_data <- array(c(1:80), dim = c(4,2,10))

# for each indiv and charac, compute the mean (4x2 matrix).
ts_mean <- apply(my_data, c(1,2), mean) 
# for each charac and time, compute the mean (2x10 matrix).
indiv_mean <- apply(my_data, c(2,3), mean) 
# for each characteristic, compute the mean (2x1 vector).
charac_mean <- apply(my_data, 2, mean) 

print(ts_mean) ; print(charac_mean)
##      [,1] [,2]
## [1,]   37   41
## [2,]   38   42
## [3,]   39   43
## [4,]   40   44
## [1] 38.5 42.5

Functions

x_vec <- c(NA, 1, 1, 1, rep(2,3))
mean(x_vec)
## [1] NA
# My function computes a mean and drops NAs
Mean_drop_NA <- function(vector){
  result <- mean(vector, na.rm = T)
  return(result)
}
Mean_drop_NA(x_vec)
## [1] 1.5
  • You can also call these functions, or functions built on-the-fly with apply
apply(my_data, c(2,3), function(x){mean(x, na.rm = T)})
##      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## [1,]  2.5 10.5 18.5 26.5 34.5 42.5 50.5 58.5 66.5  74.5
## [2,]  6.5 14.5 22.5 30.5 38.5 46.5 54.5 62.5 70.5  78.5

Calling data.frame elements

print(x_dfm)
##   first_var second_var
## 1         5          a
## 2         6          b
## 3         7          c
## 4         8          d
x_dfm[,1] == x_dfm$first_var
## [1] TRUE TRUE TRUE TRUE
x_dfm["first_var"]
##   first_var
## 1         5
## 2         6
## 3         7
## 4         8